Submodular problems - approximations and algorithms



Dorit S. Hochbaum * 
email: hochbaum@ieor.berkeley.edu

University of California, Berkeley 



Abstract. We show that any submodular minimization (SM) problem defined on a
linear constraint set, with constraints having up to two variables per
inequality, is 2-approximable in polynomial time. If the constraints are
monotone (the two variables appear with opposite sign coefficients) then the
problems of submodular minimization or supermodular maximization are
polynomial time solvable. The key idea is to link these problems to a
submodular s,t-cut problem defined here. This framework includes the problems:
SM-vertex cover; SM-2SAT; SM-min satisfiability; SM-edge deletion for clique;
SM-node deletion for biclique; and others. We also introduce here the
submodular closure problem and show that it is solvable in polynomial time and
equivalent to the submodular cut problem. All the results extend to multi-sets,
where each element of a set may appear with a multiplicity greater than 1. For
all these NP-hard problems 2-approximations are the best possible, in the sense
that a better approximation factor cannot be achieved in polynomial time unless
NP=P. The mechanism creates a relaxed "monotone" problem, solved as a
submodular closure problem, the solution to which is mapped to a half-integral
super-optimal solution to the original problem. That half-integral solution has
the persistency property, meaning that integer valued variables retain their
value in an optimal solution. This permits deleting the integer valued
variables and restricting the search for an optimal solution to the smaller set
of remaining variables.



1 Introduction 

Let V be a finite nonempty set of cardinality n. A nonnegative function f defined
on the subsets of V is said to be submodular if it satisfies, for all $X, Y \subseteq V$,

    $f(X) + f(Y) \ge f(X \cap Y) + f(X \cup Y)$.

We consider here submodular minimization on linear constraints. For any
binary vector $x = \{x_i\}_{i=1}^n$ the corresponding set X is the characteristic
set of x. Namely, $X = \{i \mid x_i = 1\}$. The problem of submodular minimization
(SM) on m linear constraints, for an $m \times n$ matrix A and a vector b, is

          $\min f(X)$
    (SM)  subject to $Ax \ge b$
          $x_j$ binary, $j \in V$.

A set X is said to be feasible if the corresponding binary vector x satisfies
$Ax \ge b$. A submodular function f is said to be monotone if $f(S) \le f(T)$ for
any $S \subseteq T$, and normalized if $f(\emptyset) = 0$. In all problems studied
here the submodular functions are monotone and normalized.
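The three properties above can be checked by brute force on a small ground set. The following is a minimal sketch, not part of the paper's method: the helper names and the coverage function used as the example are assumptions chosen for illustration, and the enumeration is exponential in n.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Check f(X) + f(Y) >= f(X & Y) + f(X | Y) for all X, Y."""
    all_sets = list(subsets(ground))
    return all(f(X) + f(Y) >= f(X & Y) + f(X | Y)
               for X in all_sets for Y in all_sets)

def is_monotone(f, ground):
    """Check f(S) <= f(T) whenever S is a subset of T."""
    all_sets = list(subsets(ground))
    return all(f(S) <= f(T) for S in all_sets for T in all_sets if S <= T)

# Hypothetical example data: element i "covers" the items in COVERS[i].
COVERS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}

def coverage(X):
    """Number of distinct items covered by the elements of X."""
    covered = set()
    for i in X:
        covered |= COVERS[i]
    return len(covered)
```

Coverage functions are a standard example of a monotone, normalized submodular function; by contrast, the squared cardinality $|X|^2$ fails the submodular inequality already for two disjoint singletons.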

Submodular minimization on multi-sets allows the multi-sets to contain
elements with multiplicity larger than 1. So for an integer, nonnegative vector
x, the corresponding multi-set is the collection of pairs
$X = \{(i, q_i) \mid x_i = q_i\}$, meaning that the set X contains element i $q_i$
times. All properties of submodular functions extend easily to multi-sets, with
the definition of containment $X_1 \subseteq X_2$ indicating that for all
$(i, q_i) \in X_1$, $(i, q'_i) \in X_2$ with $q_i \le q'_i$.

The submodular multi-set minimization problem is then,

    $\min f(X)$
    subject to $Ax \ge b$
    $0 \le x_j \le u_j$ integer, $j \in V$.

Submodular optimization problems are at least as hard to optimize, or to
approximate, as their linear optimization counterparts, since linear functions
are also submodular. Nevertheless, it is demonstrated here that all polynomial
time 2-approximation algorithms for NP-hard linear optimization in integers
over constraints with up to two variables per inequality extend to
2-approximations for the respective submodular problems, also running in
polynomial time. The complexity of the linear problem is the running time for
a minimum cut in a respective graph, whereas for the submodular problem this
is substituted by the run time of the submodular cut problem, SM-cut,
introduced here.

A solution corresponding to a cut is associated with a partition of V into a
source set and a sink set. The source sets corresponding to cuts form a ring,
since their union and intersection are also source sets of cuts. Submodular
minimization over a ring family, or over all subsets, was first shown to be
solvable in strongly polynomial time by Grötschel, Lovász, and Schrijver in
[GLS88]. Combinatorial, strongly polynomial algorithms were given later by
Schrijver [Sch00] and by Iwata, Fleischer, and Fujishige [IFF01]. The current
fastest strongly polynomial algorithm on a ring family was given by Orlin
[Or09], and later by Iwata and Orlin [IO09]. The class of submodular
minimization problems presented here is therefore of equivalent complexity to
that of the respective class of linear minimization problems, in that the
routine of minimum s,t-cut on an associated graph employed in the linear case
is replaced by the SM-cut on the same constructed graph for the respective SM
problem.

There is a large body of work on approximations in general, and
2-approximations in particular, which is not reviewed here. Instead we only
address a couple of papers within the approximation literature. The technique
we use here is an extension of that of Hochbaum et al. [HMNT93] for linear
integer minimization on two variables per inequality. Using the local-ratio
technique, Bar-Yehuda and Rawitz [BR01] offered an alternative approach for
2-approximating such problems. This has an advantage when the $u_j$'s are very
large: the construction of the graph here and in [HMNT93] affects the number
of arcs in the constructed graph by a quadratic factor of $u_j$, whereas the
local-ratio algorithm is affected by only a linear factor. The algorithm we
use for solving the SM-cut problem, of [IN09], is not affected by the number of
arcs, only by the number of nodes. Hence this complexity improvement does not
extend to the SM case. That approach also does not lend itself to an extension
to the submodular minimization case, in that it does not have the persistency
property that the half-integral super-optimal solution obtained here and in
[HMNT93] has. Koufogiannakis and Young [KY09] devised approximations for
SM-"covering" problems based on the frequency technique (called the maximal
dual feasible technique in [Hoc97] Ch. 3). This applies to submodular
minimization on "covering-type" matrices with nonnegative entries and a
"monotone" property. That "monotone" property is unrelated to the monotone
constraints addressed here.

The technique described here applies to a large class of NP-hard submod- 
ular optimization problems including: The submodular minimizations of vertex 
cover; minimum 2-SAT; minimum node deletion biclique; minimum edge dele- 
tion clique; min SAT; and any submodular optimization on constraints each 
including up to two variables. 

Two new polynomial time submodular optimization problems are introduced 
here. One is the submodular s,t-cut problem and the other is the closely related
(and shown equivalent) submodular closure problem which is to maximize (or 
to minimize) the sum of supermodular revenues minus submodular costs (or 
minimize submodular costs minus supermodular revenues). The algorithm for 
solving the submodular cut problem is the key subroutine in solving all the 
other problems presented here. 

One indication that the submodular case is possibly harder to approximate
than the linear case has been established for the vertex cover problem:
Hochbaum conjectured in [Hoc83] that the lower bound on approximability of the
vertex cover problem is $2 - \epsilon$. The tightest lower bound established to
date on the approximability of vertex cover is 1.36 (Dinur and Safra [DS02]),
whereas for the submodular monotone analog it is $2 - \epsilon$ (Goel et al.
[GKTW09]). The latter closes the gap between the lower bound and the factor 2
approximation, as conjectured in [Hoc83], for submodular vertex cover. The
results presented here mean that for all NP-hard submodular minimization
problems on two variables per inequality, there is a polynomial time
2-approximation algorithm, which cannot be improved. This is because the vertex
cover problem is as general as the entire class of these problems (as shown in
Section 5.3).

2 Notations and preliminaries

We are given an integer matrix $A_{m \times n}$ in which each row contains at most
two non-zeroes. If in every row with two non-zeroes they are of opposite signs,
then we call the matrix monotone. We define the class of submodular
minimization problems on constraints with at most two variables per
constraint, SM2:

           $\min f(X)$
    (SM2)  subject to $Ax \ge b$
           $0 \le x_j \le u_j$ integer, $j \in V$.

Problem SM2 is said to be monotone if the corresponding matrix A is monotone.
If all $u_j = 1$ we call the problem a binary SM2.

For a directed graph G = (V, A) and $B, D \subseteq V$, we denote by (B, D) the set
of arcs from nodes in B to nodes in D, $(B, D) = \{(i,j) \in A \mid i \in B, j \in D\}$.
Note that B and D need not be disjoint and can be equal. In G = (V, A), a set
of nodes $D \subseteq V$ is said to be closed if all the successors of the nodes in
D are also in D. In other words, the transitive closure of D, formed by all the
nodes reachable from nodes of D along a directed path in G, is equal to D.
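The definition of a closed set translates directly into a check over the arcs, and the transitive closure can be computed by repeatedly adding successors. This is a small illustrative sketch; the function names are assumptions and do not come from the paper.

```python
def is_closed(D, arcs):
    """True if every arc (i, j) with i in D also has j in D."""
    return all(j in D for (i, j) in arcs if i in D)

def closure(D, arcs):
    """Smallest closed set containing D: add successors until stable."""
    closed = set(D)
    changed = True
    while changed:
        changed = False
        for i, j in arcs:
            if i in closed and j not in closed:
                closed.add(j)
                changed = True
    return closed
```

For example, with arcs (1,2), (2,3), (4,3), the set {2,3} is closed while {1,3} is not, since node 1 has successor 2 outside the set.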

An arc-capacitated graph $G_{st} = (V \cup \{s,t\}, A \cup A_s \cup A_t)$, with $A_s$
the set of arcs adjacent to source s and $A_t$ the set of arcs adjacent to sink
t, is said to be a closure graph if all arcs in A have infinite capacity.

A function f is said to be supermodular if for all $X, Y \subseteq V$,

    $f(X) + f(Y) \le f(X \cap Y) + f(X \cup Y)$.

3 Examples of submodular minimization problems solved here

The submodular vertex cover problem has been shown recently to have a
2-approximation [IN09]. Here we describe a 2-approximation algorithm which
applies to the entire family of submodular minimization (and supermodular
maximization) problems with constraints with up to two variables each. The
general purpose method used is an extension of the algorithm used by Hochbaum
[Hoc83] for (linear) vertex cover, later generalized by Hochbaum et al.
[HMNT93] to all integer linear minimization on two variables per inequality,
and later still by [Hoc02] to a restricted class of integer linear minimization
on constraints with up to three variables per inequality.

One reason why the generalization works is that the technique used, of
monotonizing and binarizing, is a dual method that manipulates constraints and
is independent of the objective function. This also explains why the maximum
frequency approximation to set cover can be extended to submodular set cover
with the same approximation guarantee [KY09, IN09], whereas the greedy
approximation algorithm, which is a primal approach, cannot be extended to the
submodular version of set cover [IN09].

Vertex cover. The vertex cover problem is to find a subset of nodes in a
graph G = (V, E) so that each edge in E has at least one endpoint in the subset.

                       $\min f(X)$
    (SM-vertex-cover)  subject to $x_i + x_j \ge 1$ for all $\{i,j\} \in E$
                       $x_i$ binary, $i \in V$.

Complement of maximum clique. The maximum clique problem is a well
known optimization problem that is notoriously hard to approximate, as shown
by Håstad [Ha96]. The problem is to find in a graph the largest set of nodes
that forms a clique - a complete graph.

An equivalent statement of the clique problem is to find the complete
subgraph which maximizes the number (or more generally, the sum of weights) of
the edges in the subgraph. When the weight of each edge is 1, there is a clique
of size k if and only if there is a clique on $\binom{k}{2}$ edges. The
inapproximability result for the node version extends trivially to this edge
version as well.

The complement of this edge variant of the maximum clique problem is to
find a minimum weight set of edges to delete so that the remaining subgraph
induced on the non-isolated nodes is a clique. We define here the SM-edge
deletion for clique. For a graph G = (V, E), the submodular function f(Z) is
defined on the set of variables $z_{ij}$ for all edges $\{i,j\} \in E$. Let $x_j$
be a variable that is 1 if node j is in the clique, and 0 otherwise. Let
$z_{ij}$ be 1 if edge $\{i,j\} \in E$ is deleted.

                             $\min f(Z)$
    (SM-Clique-edge-delete)  subject to  $1 - x_i \le z_{ij}$   $\{i,j\} \in E$
                                         $1 - x_j \le z_{ij}$   $\{i,j\} \in E$
                                         $x_i + x_j \le 1$      $\{i,j\} \notin E$
                                         $x_j$ binary           $j \in V$
                                         $z_{ij}$ binary        $\{i,j\} \in E$.

This formulation has two variables per inequality and therefore 2-approximation 
follows immediately. The gadget and network for solving the monotonized SM- 
Clique-edge-delete problem are given in detail in [Hoc02]. 

Node deletion biclique. Here we consider the submodular minimization of
node deletion in a bipartite graph so that the remaining nodes form a biclique
(a complete bipartite graph). This problem is in fact polynomial time solvable
since, with the "monotonizing" transformation, an equivalent set of constraints
is monotone. In the formulation given below $x_i$ assumes the value 1 if node i
is deleted from the bipartite graph, and 0 otherwise.

                               $\min f(X)$
    (SM-Biclique-node-delete)  subject to $x_i + x_j \ge 1$ for $\{i,j\} \notin E$, $i \in V_1$, $j \in V_2$
                               $x_j \in \{0, 1\}$ for all $j \in V$.

Notice that this formulation is identical to that of the submodular vertex
cover on a bipartite graph, and both are polynomial time solvable. Note also
that the SM-node deletion clique is the same as SM-vertex cover and thus
NP-hard, and has a polynomial time 2-approximation.

Minimum satisfiability. In the problem of minimum satisfiability, MINSAT,
we are given a CNF satisfiability formula. The aim is to find an assignment
satisfying the smallest number of clauses, or the smallest weight collection of
clauses. The MINSAT problem was introduced by Kohli et al. [KKM94] and
was further studied by Marathe and Ravi [MR96]. The problem is NP-hard.

To see that the submodular minimum satisfiability problem, SM-MINSAT, can
be formulated as SM2, and is thus 2-approximable, we choose a binary variable
$y_j$ for each clause $C_j$ and $x_i$ for each literal. Let $S^+(j)$ be the set of
variables that appear unnegated and $S^-(j)$ those that are negated in clause
$C_j$. The following formulation of MINSAT has two variables per inequality and
is thus a special case of SM2:

                 $\min f(Y)$
    (SM-MINSAT)  subject to  $y_j \ge x_i$       for $i \in S^+(j)$ for clause $C_j$
                             $y_j \ge 1 - x_i$   for $i \in S^-(j)$ for clause $C_j$
                             $x_i, y_j$ binary   for all i, j.



It is interesting to note that the formulation is monotone when for all clauses
$C_j$, $S^-(j) = \emptyset$, or in all clauses $S^+(j) = \emptyset$. (In the latter
case we need to transform the x variables to x' with $x' = -x$.) Indeed in these
instances the boolean expression is uniform and the problem is trivially solved
by setting all variables to FALSE in the first case, or to TRUE in the latter
case.

MIN-2SAT. The MIN-2SAT problem is defined for a 2SAT CNF with each
clause containing at most two variables. The goal is to find a least weight
collection of variables that are set to true, so that the respective 2SAT CNF
is satisfied. Although finding a satisfying assignment to a 2SAT can be done in
polynomial time, Even et al. [EIS76], finding an assignment that minimizes the
number, or the weight, of the true variables is NP-hard.

Let X be the set of true variables, and $x_i = 1$ if the ith variable is set to
true, and 0 otherwise.

                   $\min f(X)$
                   subject to  $x_i + x_j \ge 1$   for clause $(x_i \vee x_j)$
    (SM-MIN-2SAT)              $x_i - x_j \ge 0$   for clause $(x_i \vee \bar{x}_j)$
                               $x_i + x_j \le 1$   for clause $(\bar{x}_i \vee \bar{x}_j)$
                               $x_i$ binary        for all $i = 1, \ldots, n$.

Each constraint here has up to two variables and thus this problem is in the
class SM2. Consequently we get for this problem a 2-approximation in
polynomial time. The construction of the graph is given in [Hoc02].

Additional problems related to finding a maximum biclique - a clique in a
bipartite graph - are also formulated in two variables per inequality in
[Hoc98]. Therefore all the corresponding submodular minimization problems are
also either solved in polynomial time, if monotone, or have a polynomial time
2-approximation.

4 The submodular closure problem

The submodular closure problem is defined on a directed graph G = (V, A), a
partition of the set of nodes $V = V^+ \cup V^-$, a supermodular function $f_1()$
defined on subsets of $V^+$, and a submodular function $f^-()$ defined on subsets
of $V^-$. The submodular closure problem is,

    (SM-closure)  $\max_{D \subseteq V,\ D \text{ closed}} f_1(D \cap V^+) - f^-(D \cap V^-)$.

In minimization form the SM-closure problem is
$\min_{D \subseteq V,\ D \text{ closed}} f^-(D \cap V^-) - f_1(D \cap V^+)$. Notice that
the SM-closure problem involves both a supermodular and a submodular function
in the objective. The problem of submodular optimal closure is a generalization
of the (linear) closure problem defined on a directed graph G = (V, A). The
closure requirement is represented as a set of monotone constraints:

    (max-closure)  $\max \sum_{j \in V} w_j x_j$
                   subject to $x_i - x_j \ge 0$   $\forall (i,j) \in A$,
                   $x_j$ binary, $j \in V$.

The linear max-closure problem induces a partition on V with
$V^+ = \{v \in V \mid w_v > 0\}$ and $V^- = \{v \in V \mid w_v < 0\}$. A non-binary
integer version of the minimum closure problem, with a convex separable
objective replacing the linear objective, was shown to be solvable in
polynomial time by a parametric cut algorithm in Hochbaum and Queyranne
[HQ03]. The SM-closure problem is shown next to be equivalent to a submodular
cut problem, SM-cut.







Fig. 1. The submodular closure problem on a bipartite closure graph. 

For a graph G = (V, A) and a partition of V to $V^+ \cup V^-$, we construct a
corresponding s,t graph, $G_{st}$, by adding source and sink nodes, s and t, and
connecting s to nodes of $V^+ \subseteq V$ with a set of arcs
$A_s = \{(s,i) \mid i \in V^+\}$, and connecting t to nodes of $V^- = V \setminus V^+$
with a set of arcs $A_t = \{(j,t) \mid j \in V^-\}$. Thus
$G_{st} = (V \cup \{s,t\}, A \cup A_s \cup A_t)$. A cut is a partition of the set of
nodes V to S and $\bar{S} = V \setminus S$, with $\{s\} \cup S$ called the source set
of the cut, and $\{t\} \cup \bar{S}$ called the sink set of the cut. A cut is said
to be finite if $(S, \bar{S}) = \emptyset$, or equivalently, S is closed in G. Given
two submodular functions $f^+$ and $f^-$ defined on $V^+$ and $V^-$ respectively, we
define the submodular cut problem, SM-cut, on the graph $G_{st}$ as follows,

    (SM-cut)  $\min_{D \subseteq V,\ D \text{ closed}} f^+(D \cap V^+) + f^-(D \cap V^-)$.

This definition is analogous to that of the minimum s,t-cut problem on a closure
graph. Note that for a finite cut $(\{s\} \cup S, \bar{S} \cup \{t\})$, the only arcs
that participate in the cut are arcs $(\{s\}, \bar{S} \cap V^+)$ and
$(S \cap V^-, \{t\})$. The connection between the submodular closure and SM-cut
problems is established next:

Theorem 1 The optimal solution to the SM-closure problem is also optimal for
a respective SM-cut problem.

Proof: Since the function $f_1()$ is supermodular on subsets of $V^+$, then for any
constant C the function $f'(B) = C - f_1(B \cap V^+)$ is submodular on subsets of
$V^+$. (We'll choose C large enough so the resulting function assumes nonnegative
values.) To see that, let $B_1, B_2 \subseteq V^+$,

    $f'(B_1) + f'(B_2) = 2C - [f_1(B_1 \cap V^+) + f_1(B_2 \cap V^+)]$.

Since $f_1()$ is supermodular,

    $f_1(B_1 \cap V^+) + f_1(B_2 \cap V^+) \le f_1((B_1 \cup B_2) \cap V^+) + f_1((B_1 \cap B_2) \cap V^+)$
    $= C - f'(B_1 \cap B_2) + C - f'(B_1 \cup B_2)$.

Thus, $f'(B_1) + f'(B_2) \ge f'(B_1 \cup B_2) + f'(B_1 \cap B_2)$, and $f'()$ is
submodular as claimed. Now,

    $\max_{D \subseteq V,\ D \text{ closed}} f_1(D \cap V^+) - f^-(D \cap V^-)$
    $= \max_{D \subseteq V,\ D \text{ closed}} C - f'(D \cap V^+) - f^-(D \cap V^-)$
    $= C - \min_{D \subseteq V,\ D \text{ closed}} f'(D \cap V^+) + f^-(D \cap V^-)$.

The latter minimization problem is the SM-cut problem. For the constant
$C = f_1(V^+)$ we denote the submodular function $f'()$ by $f^+()$. We thus showed
that an optimal solution to the SM-closure problem is also optimal for the
SM-cut problem.

□ 

In Figure 1 we illustrate the SM-cut problem for a bipartite instance of G, 
where each arc adjacent to source or sink is shown with the respective singleton 
node function value. 






As noted in the introduction, the feasible sets for SM-cut form a ring as the 
intersection and union of any source sets of two closed sets is also a closed set. 
As such the SM-cut is solved in polynomial time as submodular minimization 
on a ring. 
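The ring property and the SM-cut objective can be illustrated on a toy instance by enumerating closed sets directly. The sketch below is a brute-force stand-in, exponential in |V|, and not the polynomial algorithm referred to in the text; the instance data (V_PLUS, V_MINUS, ARCS) and the functions f1, f_plus, f_minus are hypothetical choices, with f_plus = C - f1 and C = f1(V+) as in the proof of Theorem 1.

```python
from itertools import combinations

def closed_sets(nodes, arcs):
    """Enumerate all closed subsets of `nodes` (tiny graphs only)."""
    nodes = sorted(nodes)
    for r in range(len(nodes) + 1):
        for combo in combinations(nodes, r):
            D = frozenset(combo)
            if all(j in D for (i, j) in arcs if i in D):
                yield D

def sm_cut_brute_force(v_plus, v_minus, arcs, f_plus, f_minus):
    """Minimize f_plus(D & V+) + f_minus(D & V-) over closed sets D."""
    def value(D):
        return f_plus(D & v_plus) + f_minus(D & v_minus)
    best = min(closed_sets(v_plus | v_minus, arcs), key=value)
    return best, value(best)

# Hypothetical toy instance.
V_PLUS = frozenset({1, 2})
V_MINUS = frozenset({3, 4})
ARCS = [(1, 3), (2, 3), (2, 4)]

def f1(B):
    """Supermodular revenue on subsets of V+ (assumed example)."""
    return len(B) ** 2

def f_plus(B):
    """f'(B) = C - f1(B) with C = f1(V+), submodular and nonnegative."""
    return f1(V_PLUS) - f1(B)

def f_minus(B):
    """Submodular cost on subsets of V- (concave in |B|, assumed example)."""
    return (0, 2, 3)[len(B)]
```

On this instance the closed sets are closed under union and intersection, and the minimizer of the SM-cut objective also maximizes f1 minus f_minus, consistent with Theorem 1.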

5 Solving submodular integer programs SM2 

Here we show that any monotone SM2 problem is equivalent to an SM-closure
problem on a graph whose number of nodes is $O(\sum_{i \in V} u_i)$. Note that if
the range of the variables is not a polynomial quantity then the size of the
graph is pseudopolynomial. This run time however cannot be made polynomial,
unless NP=P, since solving even monotone integer programs on constraints with
up to two variables per inequality is NP-hard [Lag85]. (The pseudopolynomial
run time of the algorithm for monotone constraints in [HN94] indicates that the
problem is weakly NP-hard.)

The algorithm for solving a general SM2 is to transform such a problem to a
monotone SM2, a process we refer to as monotonizing, and then transform it
to a problem with binary coefficients, referred to as binarizing. Notice that a
problem can be monotonized first, and then binarized, or the other way around 
- these two processes are commutative. The technique of binarizing a mono- 
tone system of constraints was introduced by Hochbaum and Naor in [HN94]. 
The concept of monotonizing was introduced in [HMNT93]. Given a monotone 
system of constraints, the binarizing process constructs a closure graph. A min- 
imum SM-closed set in that graph is also an optimal solution to the SM-integer 
minimization on that set of constraints. 

5.1 Binarizing 

The process of binarizing transforms the constraints into an equivalent set of
MIN-2SAT constraints. We replace the n original variables and m original
constraints by $u = \sum_{j=1}^n u_j$ new variables and at most $mU + u$ new
constraints, where $U = \max_{j=1,\ldots,n} u_j$.

Each variable $x_i$ is substituted by $u_i$ binary variables $x_{i\ell}$
($\ell = 1, \ldots, u_i$), and the added constraints $x_{i\ell} \ge x_{i,\ell+1}$
($\ell = 1, \ldots, u_i - 1$). Subject to these constraints, the correspondence
between $x_i$ and the $u_i$-tuple $(x_{i1}, \ldots, x_{iu_i})$ is one-to-one and is
characterized by $x_{i\ell} = 1$ if and only if $x_i \ge \ell$
($\ell = 1, \ldots, u_i$), or, equivalently, $x_i = \sum_{\ell=1}^{u_i} x_{i\ell}$. This
construction is then represented as a part of a closure graph, where for each
variable $x_i$ there is a chain of infinite capacity arcs from the node
representing $x_{i,\ell+1}$ to the one representing $x_{i\ell}$.
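The correspondence between an integer variable and its chain of binary copies can be sketched as follows. The function names are assumptions made for illustration; the encoding itself is the one described above, with the l-th copy equal to 1 exactly when the integer value is at least l.

```python
def binarize_value(x_i, u_i):
    """Return [x_i1, ..., x_iu] with x_il = 1 iff x_i >= l."""
    if not 0 <= x_i <= u_i:
        raise ValueError("x_i must lie in [0, u_i]")
    return [1 if x_i >= l else 0 for l in range(1, u_i + 1)]

def satisfies_chain(bits):
    """Check the added constraints x_il >= x_i,l+1 along the chain."""
    return all(bits[l] >= bits[l + 1] for l in range(len(bits) - 1))

def decode(bits):
    """Recover x_i as the sum of its binary copies."""
    return sum(bits)
```

Every value in the range 0, ..., u_i maps to a distinct tuple satisfying the chain constraints, and summing the tuple recovers the value.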

We now explain how to transform the constraints of the given system into
constraints in terms of the binary variables $x_{i\ell}$. For a general constraint
of the form $a_{ki}x_i + a_{kj}x_j \ge b_k$, consider the case where both $a_{ki}$ and
$a_{kj}$ are positive, and assume without loss of generality that
$0 < b_k < a_{ki}u_i + a_{kj}u_j$. The other cases, where one coefficient is negative
(and the constraint is monotone) or both are negative, are similarly binarized.

For every $\ell$ ($\ell = 0, \ldots, u_i$), let
$\alpha_{k\ell} = \lceil (b_k - \ell a_{ki}) / a_{kj} \rceil - 1$. For any integer
solution x, $a_{ki}x_i + a_{kj}x_j \ge b_k$ if and only if for every $\ell$
($\ell = 0, \ldots, u_i - 1$),

    either $x_i > \ell$ or $x_j > \alpha_{k\ell}$,

or, equivalently,

    either $x_i \ge \ell + 1$ or $x_j \ge \alpha_{k\ell} + 1$,

which can be written as

    $x_{i,\ell+1} + x_{j,\alpha_{k\ell}+1} \ge 1$.

Obviously, if $\alpha_{k\ell} \ge u_j$, then the second alternative is impossible and
we fix the variable, $x_{i,\ell+1} = 1$.
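The equivalence between a two-variable constraint with positive coefficients and its family of disjunctions can be checked by brute force on small ranges. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the ceiling is computed in integer arithmetic, and the disjunction is checked for every l = 0, ..., u_i (the full range over which alpha is defined), so that for l = u_i the first alternative is impossible and the condition reduces to a lower bound on x_j.

```python
def alpha(b_k, a_ki, a_kj, l):
    """alpha_{kl} = ceil((b_k - l * a_ki) / a_kj) - 1, for a_kj > 0."""
    # ceil(p / q) for q > 0 equals -((-p) // q) in integer arithmetic.
    return -((l * a_ki - b_k) // a_kj) - 1

def original_holds(x_i, x_j, a_ki, a_kj, b_k):
    """The original constraint a_ki * x_i + a_kj * x_j >= b_k."""
    return a_ki * x_i + a_kj * x_j >= b_k

def binarized_holds(x_i, x_j, a_ki, a_kj, b_k, u_i):
    """Check 'x_i > l or x_j > alpha_{kl}' for every l = 0, ..., u_i."""
    return all(x_i > l or x_j > alpha(b_k, a_ki, a_kj, l)
               for l in range(u_i + 1))

def equivalent_everywhere(a_ki, a_kj, b_k, u_i, u_j):
    """Compare both forms on every integer point of the box."""
    return all(
        original_holds(x_i, x_j, a_ki, a_kj, b_k)
        == binarized_holds(x_i, x_j, a_ki, a_kj, b_k, u_i)
        for x_i in range(u_i + 1) for x_j in range(u_j + 1))
```

For instance, with the constraint 2 x_i + 3 x_j >= 7 and l = 2, alpha is 0, so the disjunction reads "x_i > 2 or x_j > 0".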

If the above transformation is applied to a monotone system of inequalities,
then the resulting 2-SAT integer program is also monotone. Thus, altogether we
have replaced one original constraint on $x_i$ and $x_j$ by at most $u_i + 1$
constraints on the variables $x_{i\ell}$ and $x_{j\ell}$. The other cases,
corresponding to different sign combinations of $a_{ki}$, $a_{kj}$, and $b_k$, can be
handled in a similar way. We thus showed,

Theorem 2 The set of SM2 constraints is equivalent to the constraints of
SM-MIN-2SAT on at most nU binary variables and mU + u constraints.



5.2 Monotonizing and constructing an equivalent SM-closure problem

With Theorem 2 we can now assume that the SM2 problem is given as an
SM-MIN-2SAT problem. The only constraints in SM-MIN-2SAT (other than the
binary requirements) are of the form $x_i + x_j \ge 1$ or $x_i + x_j \le 1$ or
$x_i - x_j \ge 0$.

We replace each variable x by two variables, $x^+ \in \{0, 1\}$ and
$x^- \in \{-1, 0\}$. The nonmonotone inequality $x_i + x_j \ge 1$ is then replaced
by two monotone inequalities:

    $x_i^+ - x_j^- \ge 1$
    $-x_i^- + x_j^+ \ge 1$.

The inequality $x_i + x_j \le 1$ is replaced by,

    $-x_i^- + x_j^+ \le 1$
    $x_i^+ - x_j^- \le 1$.

And a monotone inequality, $x_i - x_j \ge 0$, is replaced by:

    $x_i^+ - x_j^+ \ge 0$
    $-x_i^- + x_j^- \ge 0$.
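When the two copies are consistent, that is $x^+ = x$ and $x^- = -x$, each pair of monotone inequalities is equivalent to the original constraint; when the copies are allowed to disagree, the monotone system is a relaxation. The sketch below checks both facts by enumeration. The constraint labels "ge1", "le1" and "diff" and the function names are hypothetical, introduced only for this illustration.

```python
from itertools import product

def original(kind, xi, xj):
    """Original SM-MIN-2SAT constraint on binary xi, xj."""
    if kind == "ge1":    # x_i + x_j >= 1
        return xi + xj >= 1
    if kind == "le1":    # x_i + x_j <= 1
        return xi + xj <= 1
    if kind == "diff":   # x_i - x_j >= 0
        return xi - xj >= 0
    raise ValueError(kind)

def monotonized(kind, xi_p, xi_m, xj_p, xj_m):
    """Pair of monotone inequalities on the copies (x+ in {0,1}, x- in {-1,0})."""
    if kind == "ge1":
        return xi_p - xj_m >= 1 and -xi_m + xj_p >= 1
    if kind == "le1":
        return -xi_m + xj_p <= 1 and xi_p - xj_m <= 1
    if kind == "diff":
        return xi_p - xj_p >= 0 and -xi_m + xj_m >= 0
    raise ValueError(kind)

def consistent_copies_agree():
    """With x+ = x and x- = -x the two systems accept the same points."""
    return all(
        original(kind, xi, xj) == monotonized(kind, xi, -xi, xj, -xj)
        for kind in ("ge1", "le1", "diff")
        for xi, xj in product((0, 1), repeat=2))
```

The assignment $x_i^+ = x_j^+ = 1$, $x_i^- = x_j^- = 0$ satisfies the monotone pair for $x_i + x_j \ge 1$ even though its copies are inconsistent, which is why the monotone problem is a relaxation rather than a reformulation.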

We now construct a closure graph for solving the monotone problem: There
is a node for each variable, and the nodes corresponding to variables in $x^+$
form the set $V^+$ and those corresponding to variables in $x^-$ form the set
$V^-$. A node in $V^+$ or $V^-$ is selected in the closed set if and only if the
corresponding value is $x_j^+ = 1$ or $x_j^- = 0$, respectively. For a constraint
$x_i^+ - x_j^- \ge 1$ there is an arc of infinite capacity $(x_j^-, x_i^+)$. This
ensures that if $x_j^- = 0$, and so $x_j^-$ is in the closed set, then so is
$x_i^+$, and thus $x_i^+ = 1$. Similarly we construct for constraint
$x_i^+ - x_j^- \le 1$ an infinite capacity arc $(x_i^+, x_j^-)$, and for
$x_i^+ - x_j^+ \ge 0$ an infinite capacity arc $(x_j^+, x_i^+)$. We thus proved:

Lemma 1. The set $\{j \in V^+ \mid x_j^+ = 1\} \cup \{j \in V^- \mid x_j^- = 0\}$ is
closed in the constructed graph if and only if the solution $x^+$ and $x^-$ is
feasible for the set of monotone inequalities.

With the notation $X^+ = \{i \mid x_i^+ = 1\}$ and $X^- = \{i \mid x_i^- = -1\}$ we set
the objective function of this relaxed (monotonized binarized) SM2 to
$\min f(X^+) + f(X^-)$, where the sets $X^+$ and $X^-$ are subsets of the respective
copies of V, $V^+$ and $V^-$. The closed set constraints correspond to the
requirement that $D_x = X^+ \cup (V^- \setminus X^-)$ is closed. Therefore, solving the
monotonized problem for $x^+, x^-$ is equivalent to solving the respective
SM-cut problem,

    $\min_{D_x \subseteq V,\ D_x \text{ closed}} f(D_x \cap V^+) + f(D_x \cap V^-)$.

Notice that this problem is equivalent to solving the min SM-cut in the
reverse graph, or, we could have defined a closed set as a set containing all its
predecessors. We conclude that we can solve in polynomial time the relaxed SM2
with the objective $g(X^+, X^-) = f(X^+) + f(X^-)$.

Let $X'^+ \subseteq V^+$ and $X'^- \subseteq V^-$ be the sets minimizing g() among all
feasible pairs of sets for the relaxed SM2. Let $S^*$ be an optimal set
minimizing the function f() in the (original) SM2 formulation, with $S^{*+}$ and
$S^{*-}$ the copies of $S^*$ in $V^+$ and $V^-$ respectively. Then,

    $2f(S^*) = f(S^{*+}) + f(S^{*-}) \ge g(X'^+, X'^-) = f(X'^+) + f(X'^-)$
    $\ge f(X'^+ \cup X'^-) + f(X'^+ \cap X'^-) \ge f(X'^+ \cup X'^-)$.

The first inequality holds since the pair $(X'^+, X'^-)$ is an optimal solution
to the relaxed SM2, for which the pair $(S^{*+}, S^{*-})$ is feasible. The second
inequality follows from the submodularity of the function f, and the third from
its nonnegativity. $f(X'^+ \cup X'^-)$ is the value of our solution, where an
element is included if either one of its two copies is in $X'^+$ or in $X'^-$.

If both nodes $x_j^+, x_j^-$ are of value 1 and -1 respectively, then we set the
value of $x_j = 1$. Let $V^1$ be the set of such variables. If both $x_j^+, x_j^-$
are of value 0, we let $x_j = 0$, and $V^0$ is the set of these variables. The set
of remaining variables, which have exactly one of $x_j^+, x_j^-$ of absolute value
1 and the other 0, we call $V^{1/2}$.

Since the binarized problem, which is a 2SAT problem, is equivalent to the
original SM2 (rather than being a relaxation of it), it is possible to find a
feasible solution to the respective 2SAT expression by using the linear time
algorithm of [EIS76]. Let such a feasible solution be $z^*$. The proof of
Theorem 3 demonstrates that setting $x_i = z^*_i$ for each $i \in V^{1/2}$ is
feasible for SM2.

Theorem 3 ([HMNT93]) If the integer problem on 2 variables per inequality
has a feasible solution, then the solution to the binarized monotonized constraints
has a feasible rounding that is found in linear time in the number of constraints.

Let $Z = \{i \mid z^*_i = 1, i \in V^{1/2}\}$. Since $V^1 \cup V^{1/2} = X'^+ \cup X'^-$
and $Z \subseteq V^{1/2}$, it follows from the monotonicity of the function f that,

    $f(V^1 \cup Z) \le f(X'^+ \cup X'^-)$.

Therefore we conclude that $f(V^1 \cup Z) \le 2f(S^*)$, thus demonstrating a
polynomial time 2-approximation algorithm for SM2.
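The chain of inequalities can be observed on a toy SM-vertex cover instance. The following is a minimal sketch under loud assumptions: the relaxed problem is solved by brute force over pairs (X+, X-) instead of by an SM-cut algorithm, the rounding sets every half-integral variable to 1 (feasible for vertex cover, where the all-ones vector satisfies every constraint), and the graph, weights and the function f (square root of total weight) are hypothetical example data.

```python
from itertools import combinations
from math import sqrt

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_cover(X, edges):
    """Original constraints: x_i + x_j >= 1 for every edge."""
    return all(i in X or j in X for i, j in edges)

def relaxed_feasible(Xp, Xm, edges):
    """Monotonized constraints x_i^+ - x_j^- >= 1 and -x_i^- + x_j^+ >= 1,
    written in set form with X+ = {i: x_i^+ = 1}, X- = {i: x_i^- = -1}."""
    return all((i in Xp or j in Xm) and (i in Xm or j in Xp)
               for i, j in edges)

def two_approx_vertex_cover(nodes, edges, f):
    """Brute-force stand-in for the SM-cut step, then return X'+ | X'-."""
    all_sets = list(subsets(nodes))
    Xp, Xm = min(((P, M) for P in all_sets for M in all_sets
                  if relaxed_feasible(P, M, edges)),
                 key=lambda pair: f(pair[0]) + f(pair[1]))
    return Xp | Xm

def optimum(nodes, edges, f):
    """Exact optimum of SM-vertex cover by enumeration."""
    return min(f(X) for X in subsets(nodes) if is_cover(X, edges))

# Hypothetical toy instance.
NODES = {1, 2, 3, 4}
EDGES = [(1, 2), (2, 3), (1, 3), (3, 4)]
WEIGHT = {1: 1, 2: 1, 3: 4, 4: 1}

def f(X):
    """Monotone, normalized submodular function: concave of a modular sum."""
    return sqrt(sum(WEIGHT[i] for i in X))
```

The returned set is a vertex cover because each edge has an endpoint in X'+ or in X'-, and its value is at most twice the optimum by the inequalities above.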

5.3 Persistency and proof of best possible approximations 

We consider the variables that are in the solution set of the relaxed SM2, $V^1$
and $V^0$, to be "integral". The variables that are in $V^{1/2}$ are considered
half-integral.

The proof of Theorem 3 demonstrates that the half-integral solution has the
persistency property. That is, if any variable $x_i$ is integer, meaning it is in
$V^1$ or $V^0$, then $x_i$ retains this integer value in an optimal solution.

In [Hoc97] (page 132) we show that any 2-SAT is equivalent to a vertex
cover problem. Therefore the impossibility of approximating SM-vertex cover in
polynomial time within a factor better than 2, unless NP=P [GKTW09], implies
that the 2-approximation algorithms given here for SM2 are best possible and
cannot be improved.

6 Conclusions 

We demonstrate here best possible 2-approximation algorithms for a large
family of NP-hard submodular optimization problems. The technique is unified
across all these problems, which are treated as submodular minimization over
linear constraints with at most two variables per inequality. The running time
of the algorithms is strongly polynomial. We further introduce here a new
submodular optimization problem - the submodular closure problem - which is
the foundation of all approximation algorithms for submodular optimization
over constraints with at most two variables per inequality. The results extend
to multi-sets in a manner analogous to the rounding of solutions for integer
problems given e.g. in [Hoc97] Section 3.8.2. This settles, for the first time,
the approximation and complexity status of a number of submodular minimization
problems including: SM-2SAT; SM-min satisfiability; SM-edge deletion for
clique; and SM-node deletion for biclique.